Accelerated Reinforcement Learning

نویسنده

  • K. Lakshmanan
چکیده

Policy gradient methods are widely used in reinforcement learning algorithms to search for better policies in the parameterized policy space. They do gradient search in the policy space and are known to converge very slowly. Nesterov developed an accelerated gradient search algorithm for convex optimization problems. This has been recently extended for non-convex and also stochastic optimization. We use Nesterov’s acceleration for policy gradient search in the well-known actor-critic algorithm and show the convergence using ODE method. We tested this algorithm on a scheduling problem. Here an incoming job is scheduled into one of the four queues based on the queue lengths. We see from experimental results that algorithm using Nesterov’s acceleration has significantly better performance compared to algorithm which do not use acceleration. To the best of our knowledge this is the first time Nesterov’s acceleration has been used with actor-critic algorithm.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Using Case Based Heuristics to Speed up Reinforcement Learning

The aim of this work is to combine three successful AI techniques –Reinforcement Learning (RL), Heuristics Search and Case Based Reasoning (CBR)– creating a new algorithm that allows the use of cases in a case base as heuristics to speed up Reinforcement Learning algorithms. This approach, called Case Based Heuristically Accelerated Reinforcement Learning (CB-HARL), builds upon an emerging tech...

متن کامل

The Use of Cases as Heuristics to speed up Multiagent Reinforcement Learning

This work presents a new approach that allows the use of cases in a case base as heuristics to speed up Multiagent Reinforcement Learning algorithms, combining Case Based Reasoning (CBR) and Multiagent Reinforcement Learning (MRL) techniques. This approach, called Case Based Heuristically Accelerated Multiagent Reinforcement Learning (CB-HAMRL), builds upon an emerging technique, Heuristic Acce...

متن کامل

Argumentation Accelerated Reinforcement Learning for RoboCup Keepaway-Takeaway

Multi-Agent Learning (MAL) is a complex problem, especially in real-time systems where both cooperative and competitive learning are involved. We study this problem in the RoboCup Soccer KeepawayTakeaway game and propose Argumentation Accelerated Reinforcement Learning (AARL) for this game. AARL incorporates heuristics, represented by arguments in Value-Based Argumentation, into Reinforcement L...

متن کامل

Improving Reinforcement Learning by Using Case Based Heuristics

This work presents a new approach that allows the use of cases in a case base as heuristics to speed up Reinforcement Learning algorithms, combining Case Based Reasoning (CBR) and Reinforcement Learning (RL) techniques. This approach, called Case Based Heuristically Accelerated Reinforcement Learning (CB-HARL), builds upon an emerging technique, the Heuristic Accelerated Reinforcement Learning ...

متن کامل

Heuristically Accelerated Reinforcement Learning: Theoretical and Experimental Results

Since finding control policies using Reinforcement Learning (RL) can be very time consuming, in recent years several authors have investigated how to speed up RL algorithms by making improved action selections based on heuristics. In this work we present new theoretical results – convergence and a superior limit for value estimation errors – for the class that encompasses all heuristicsbased al...

متن کامل

Case-Based Multiagent Reinforcement Learning: Cases as Heuristics for Selection of Actions

This work presents a new approach that allows the use of cases in a case base as heuristics to speed up Multiagent Reinforcement Learning algorithms, combining Case-Based Reasoning (CBR) and Multiagent Reinforcement Learning (MRL) techniques. This approach, called Case-Based Heuristically Accelerated Multiagent Reinforcement Learning (CB-HAMRL), builds upon an emerging technique, Heuristic Acce...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1710.08070  شماره 

صفحات  -

تاریخ انتشار 2017